On the Training Instability of Shuffling SGD with Batch Normalization. | ICML | 2023 | 0 |
Global optimality for Euclidean CCCP under Riemannian convexity. | ICML | 2023 | 0 |
The Crucial Role of Normalization in Sharpness-Aware Minimization. | NIPS/NeurIPS | 2023 | 0 |
Transformers learn to implement preconditioned gradient descent for in-context learning. | NIPS/NeurIPS | 2023 | 0 |
Sign and Basis Invariant Networks for Spectral Graph Representation Learning. | ICLR | 2023 | 0 |
CCCP is Frank-Wolfe in disguise. | NIPS/NeurIPS | 2022 | 1 |
Understanding the unstable convergence of gradient descent. | ICML | 2022 | 16 |
Max-Margin Contrastive Learning. | AAAI | 2022 | 0 |
Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity. | ICML | 2022 | 0 |
Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective. | ICML | 2022 | 0 |
Efficient Sampling on Riemannian Manifolds via Langevin MCMC. | NIPS/NeurIPS | 2022 | 0 |
Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond. | ICLR | 2022 | 0 |
Understanding Riemannian Acceleration via a Proximal Extragradient Framework. | COLT | 2022 | 0 |
Can contrastive learning avoid shortcut solutions? | NIPS/NeurIPS | 2021 | 44 |
Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates. | NIPS/NeurIPS | 2021 | 2 |
Provably Efficient Algorithms for Multi-Objective Competitive RL. | ICML | 2021 | 13 |
Three Operator Splitting with a Nonconvex Loss Function. | ICML | 2021 | 4 |
Online Learning in Unknown Markov Games. | ICML | 2021 | 30 |
Coping with Label Shift via Distributionally Robust Optimisation. | ICLR | 2021 | 0 |
Contrastive Learning with Hard Negative Samples. | ICLR | 2021 | 0 |
Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD? | COLT | 2021 | 0 |
SGD with shuffling: optimal rates without component convexity and large epoch requirements. | NIPS/NeurIPS | 2020 | 26 |
Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes. | NIPS/NeurIPS | 2020 | 21 |
Why are Adaptive Methods Good for Attention Models? | NIPS/NeurIPS | 2020 | 91 |
Strength from Weakness: Fast Learning Using Weak Supervision. | ICML | 2020 | 23 |
From Nesterov's Estimate Sequence to Riemannian Acceleration. | COLT | 2020 | 50 |
Geodesically-convex optimization for averaging partially observed covariance matrices. | ACML | 2020 | 2 |
Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions. | ICML | 2020 | 19 |
Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition. | ICML | 2020 | 62 |
Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity. | ICLR | 2020 | 0 |
Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator. | ICML | 2019 | 37 |
Flexible Modeling of Diversity with Strongly Log-Concave Distributions. | NIPS/NeurIPS | 2019 | 10 |
Escaping Saddle Points with Adaptive Gradient Methods. | ICML | 2019 | 59 |
Are deep ResNets provably better than linear predictors? | NIPS/NeurIPS | 2019 | 11 |
Random Shuffling Beats SGD after Finite Epochs. | ICML | 2019 | 0 |
Learning Determinantal Point Processes by Corrective Negative Sampling. | AISTATS | 2019 | 0 |
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity. | NIPS/NeurIPS | 2019 | 0 |
Direct Runge-Kutta Discretization Achieves Acceleration. | NIPS/NeurIPS | 2018 | 92 |
Exponentiated Strongly Rayleigh Distributions. | NIPS/NeurIPS | 2018 | 12 |
An Estimate Sequence for Geodesically Convex Optimization. | COLT | 2018 | 43 |
Non-Linear Temporal Subspace Representations for Activity Recognition. | CVPR | 2018 | 39 |
Modular Proximal Optimization for Multidimensional Total-Variation Regularization. | JMLR | 2018 | 0 |
A Generic Approach for Escaping Saddle points. | AISTATS | 2018 | 0 |
Elementary Symmetric Polynomials for Optimal Experimental Design. | NIPS/NeurIPS | 2017 | 18 |
Polynomial time algorithms for dual volume sampling. | NIPS/NeurIPS | 2017 | 31 |
Combinatorial Topic Models using Small-Variance Asymptotics. | AISTATS | 2017 | 0 |
Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization. | NIPS/NeurIPS | 2016 | 155 |
Kronecker Determinantal Point Processes. | NIPS/NeurIPS | 2016 | 26 |
Stochastic Variance Reduction for Nonconvex Optimization. | ICML | 2016 | 503 |
First-order Methods for Geodesically Convex Optimization. | COLT | 2016 | 205 |
Geometric Mean Metric Learning. | ICML | 2016 | 136 |
Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds. | NIPS/NeurIPS | 2016 | 169 |
Fast DPP Sampling for Nystrom with Application to Kernel Methods. | ICML | 2016 | 73 |
AdaDelay: Delay Adaptive Distributed Stochastic Optimization. | AISTATS | 2016 | 31 |
Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling. | NIPS/NeurIPS | 2016 | 33 |
Gaussian quadrature for matrix inverse forms with applications. | ICML | 2016 | 0 |
Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms. | ICML | 2016 | 0 |
Efficient Sampling for k-Determinantal Point Processes. | AISTATS | 2016 | 0 |
Fixed-point algorithms for learning determinantal point processes. | ICML | 2015 | 44 |
Matrix Manifold Optimization for Gaussian Mixtures. | NIPS/NeurIPS | 2015 | 73 |
Data modeling with the elliptical gamma distribution. | AISTATS | 2015 | 6 |
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants. | NIPS/NeurIPS | 2015 | 191 |
Large-scale randomized-coordinate descent methods with non-separable linear constraints. | UAI | 2015 | 0 |
Efficient Structured Matrix Rank Minimization. | NIPS/NeurIPS | 2014 | 19 |
Towards an optimal stochastic alternating direction method of multipliers. | ICML | 2014 | 59 |
Randomized Nonlinear Component Analysis. | ICML | 2014 | 165 |
Riemannian Sparse Coding for Positive Definite Matrices. | ECCV | 2014 | 55 |
Fast Newton methods for the group fused lasso. | UAI | 2014 | 18 |
Geometric optimisation on positive definite matrices for elliptically contoured distributions. | NIPS/NeurIPS | 2013 | 28 |
Jensen-Bregman LogDet Divergence with Application to Efficient Similarity Search for Covariance Matrices. | TPAMI | 2013 | 157 |
Reflection methods for user-friendly submodular optimization. | NIPS/NeurIPS | 2013 | 76 |
Fast projections onto mixed-norm balls with applications. | DMKD | 2012 | 28 |
Scalable nonconvex inexact proximal splitting. | NIPS/NeurIPS | 2012 | 64 |
A new metric on the manifold of kernel matrices with application to matrix geometric means. | NIPS/NeurIPS | 2012 | 133 |
Fast Newton-type Methods for Total Variation Regularization. | ICML | 2011 | 87 |
Fast Projections onto ℓ1,q-Norm Balls for Grouped Feature Selection. | ECML/PKDD | 2011 | 39 |
Generalized Dictionary Learning for Symmetric Positive Definite Matrices with Application to Nearest Neighbor Retrieval. | ECML/PKDD | 2011 | 50 |
Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet Divergence. | ICCV | 2011 | 77 |
A scalable trust-region algorithm with application to mixed-norm regression. | ICML | 2010 | 41 |
Efficient filter flow for space-variant multiframe blind deconvolution. | CVPR | 2010 | 232 |
Convex Perturbations for Scalable Semidefinite Programming. | AISTATS | 2009 | 9 |
Workshop summary: Numerical mathematics in machine learning. | ICML | 2009 | 0 |
Block-Iterative Algorithms for Non-negative Matrix Approximation. | ICDM | 2008 | 6 |
Fast Newton-type Methods for the Least Squares Nonnegative Matrix Approximation Problem. | SDM | 2007 | 139 |
Information-theoretic metric learning. | ICML | 2007 | 0 |
Incremental Aspect Models for Mining Document Streams. | ECML/PKDD | 2006 | 18 |
Efficient Large Scale Linear Programming Support Vector Machines. | ECML/PKDD | 2006 | 19 |
Generalized Nonnegative Matrix Approximations with Bregman Divergences. | NIPS/NeurIPS | 2005 | 479 |
Clustering on the Unit Hypersphere using von Mises-Fisher Distributions. | JMLR | 2005 | 874 |
Triangle Fixing Algorithms for the Metric Nearness Problem. | NIPS/NeurIPS | 2004 | 19 |
Minimum Sum-Squared Residue Co-Clustering of Gene Expression Data. | SDM | 2004 | 327 |
Generative model-based clustering of directional data. | KDD | 2003 | 122 |